Spark Parameter Tuning via Trial-and-Error
Authors
Abstract
Spark has been established as an attractive platform for big data analysis, since it manages to hide most of the complexities related to parallelism, fault tolerance and cluster setting from developers. However, this comes at the expense of having over 150 configurable parameters, the impact of which cannot be exhaustively examined due to the exponential number of their combinations. The default values allow developers to quickly deploy their applications but leave open the question of whether performance can be improved. In this work, we investigate the impact of the most important tunable Spark parameters on application performance and guide developers on how to change the default values. We conduct a series of experiments with known benchmarks on the MareNostrum petascale supercomputer to test the performance sensitivity. More importantly, we offer a trial-and-error methodology for tuning parameters in arbitrary applications based on evidence from a very small number of experimental runs. We test our methodology in three case studies, where we manage to achieve speedups of more than 10 times.
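The abstract does not spell out the steps of the methodology, so the following is only a minimal sketch of what a trial-and-error tuning loop could look like, not the authors' procedure. It assumes a greedy strategy: run once with the defaults, then try one configuration change at a time and keep it only if it improves the runtime by a threshold. The names `tune_by_trial_and_error`, `run_once`, `min_gain` and `fake_runtime` are hypothetical, the choice and order of candidate settings are illustrative, and the toy runtime model stands in for real benchmark runs on a cluster.

```python
from typing import Callable, Dict, List, Tuple

Conf = Dict[str, str]

KRYO = "org.apache.spark.serializer.KryoSerializer"


def tune_by_trial_and_error(
    run_once: Callable[[Conf], float],
    candidates: List[Tuple[str, str]],
    min_gain: float = 0.05,
) -> Tuple[Conf, float, int]:
    """Greedy tuning: test one change at a time, keep it only if it pays off.

    run_once is a hypothetical callback that runs the application with the
    given overrides and returns the elapsed time in seconds.
    """
    best_conf: Conf = {}             # no overrides, i.e. the default values
    best_time = run_once(best_conf)  # baseline run
    runs = 1
    for key, value in candidates:
        trial = dict(best_conf)
        trial[key] = value
        elapsed = run_once(trial)
        runs += 1
        # Accept the change only if it beats the best time by min_gain.
        if elapsed < best_time * (1.0 - min_gain):
            best_conf, best_time = trial, elapsed
    return best_conf, best_time, runs


def fake_runtime(conf: Conf) -> float:
    """Toy runtime model (an assumption), used instead of a real Spark job."""
    seconds = 100.0
    if conf.get("spark.serializer") == KRYO:
        seconds -= 20.0
    if conf.get("spark.shuffle.compress") == "false":
        seconds += 15.0
    if conf.get("spark.io.compression.codec") == "lzf":
        seconds -= 2.0
    return seconds


# Illustrative candidate changes, tried in this order.
candidates = [
    ("spark.serializer", KRYO),
    ("spark.shuffle.compress", "false"),
    ("spark.io.compression.codec", "lzf"),
]

best_conf, best_time, runs = tune_by_trial_and_error(fake_runtime, candidates)
print(best_conf, best_time, runs)
```

With this scheme the number of runs grows linearly with the number of candidate changes (one baseline plus one run per candidate) rather than exponentially with the number of parameters, which is the point of avoiding an exhaustive search. In a real setting, `run_once` would submit the application with the given overrides and measure its wall-clock time.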
Similar resources
A Methodology for Spark Parameter Tuning
Spark has been established as an attractive platform for big data analysis, since it manages to hide most of the complexities related to parallelism, fault tolerance and cluster setting from developers. However, this comes at the expense of having over 150 configurable parameters, the impact of which cannot be exhaustively examined due to the exponential amount of their combinations. The defaul...
Parameter Tuning via Kernel Matrix Approximation for Support Vector Machine
Parameter tuning is essential to generalization of support vector machine (SVM). Previous methods usually adopt a nested two-layer framework, where the inner layer solves a convex optimization problem, and the outer layer selects the hyper-parameters by minimizing either cross validation or other error bounds. In this paper, we propose a novel parameter tuning approach for SVM via kernel matrix...
A Unified IMC based PI/PID Controller Tuning Approach for Time Delay Processes
This paper proposes a new PI/PID controller tuning method within filtered Smith predictor (FSP) configuration in order to deal with various types of time delay processes including stable, unstable and integrating delay dominant and slow dynamic processes. The proposed PI/PID controller is designed based on the IMC principle and is tuned using a new constraint and without requiring any approxima...
zTuned: Automated SQL Tuning through Trial and (Sometimes) Error
SQL tuning—the attempt to improve a poorly-performing execution plan produced by the database query optimizer—is a critical aspect of database performance tuning. Ironically, as commercial databases strive to improve on the manageability front, SQL tuning is becoming more of a black art. It requires a high level of expertise in areas like (i) query optimization, run-time execution of query plan...
Adaptive PSO Based LQR Tuning for Trajectory Tracking of Inverted Pendulum
The problem of state feedback control design is conventionally handled by pole assignment or Linear Quadratic Regulator (LQR) method via Algebraic Riccati Equation (ARE). However, these methods still suffer from the disadvantage of trial and error approach for parameter tuning. To be specific, selecting the weighting matrices Q and R of LQR has to be done by trial and error approach. Hence to a...
Journal title:
Volume, issue:
Pages: -
Publication date: 2016